Analysis of GLDS-213 from NASA GeneLab
This R Markdown file was auto-generated by the iDEP website using iDEP 0.91, originally by Steven Xijin Ge (Xijin.Ge@sdstate.edu).
Ge SX, Son EW, Yao R: iDEP: an integrated web application for differential expression and pathway analysis of RNA-Seq data. BMC Bioinformatics 2018, 19(1):534. PMID:30567491
First we set up the working directory to where the files are saved.
setwd('~/Documents/HTML_R/GLDS213')
R packages and iDEP core functions. Users can also download the iDEP_core_functions.R file. Many R packages need to be installed first, which may take hours. Each of these packages took years to develop, so be a patient thief. Sometimes dependencies need to be installed manually. If you are using an older version of R and have trouble with package installation, try uninstalling the current version of R, deleting all of its folders and files (e.g., C:/Program Files/R/R-3.4.3), and reinstalling from scratch.
if(file.exists('iDEP_core_functions.R'))
source('iDEP_core_functions.R') else
source('https://raw.githubusercontent.com/iDEP-SDSU/idep/master/shinyapps/idep/iDEP_core_functions.R')
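Before sourcing the core functions, it can help to check which packages are still missing. The sketch below is not part of iDEP: `missing_packages()` is a hypothetical helper, and the package list only names packages that appear in this report (knitr, kableExtra, limma, DESeq2). The full iDEP dependency list is longer.

```r
# Hypothetical helper (not an iDEP function): return the packages that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Packages mentioned in this report; iDEP itself needs many more.
cran_pkgs <- c("knitr", "kableExtra")
bioc_pkgs <- c("limma", "DESeq2")

to_install_cran <- missing_packages(cran_pkgs)
to_install_bioc <- missing_packages(bioc_pkgs)

# Uncomment to install; Bioconductor packages go through BiocManager.
# if (length(to_install_cran) > 0) install.packages(to_install_cran)
# if (length(to_install_bioc) > 0) {
#   if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
#   BiocManager::install(to_install_bioc)
# }
```

Keeping the install calls commented out makes the check safe to run repeatedly; only the report of missing packages is produced by default.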
We are using the downloaded gene expression file in which gene IDs have been converted to Ensembl gene IDs, because the ID conversion database is too large to download. You can use your original file if it already uses Ensembl IDs, or if you do not want to use the pathway files available in iDEP (or none are available for your species).
inputFile <- 'GLDS213_Expression.csv'
sampleInfoFile <- 'GLDS213_Sampleinfo.csv'
gldsMetadataFile <- 'GLDS213_Metadata.csv'
geneInfoFile <- 'Arabidopsis_thaliana__athaliana_eg_gene_GeneInfo.csv' #Gene symbols, location etc.
geneSetFile <- 'Arabidopsis_thaliana__athaliana_eg_gene.db' # pathway database in SQL; can be GMT format
STRING10_speciesFile <- 'https://raw.githubusercontent.com/iDEP-SDSU/idep/master/shinyapps/idep/STRING10_species.csv'
Parameters for reading data
input_missingValue <- 'geneMedian' #Missing values imputation method
input_dataFileFormat <- 1 #1- read counts, 2 FPKM/RPKM or DNA microarray
input_minCounts <- 0.5 #Min counts
input_NminSamples <- 1 #Minimum number of samples
input_countsLogStart <- 4 #Pseudo count for log CPM
input_CountsTransform <- 1 #Methods for data transformation of counts. 1-EdgeR's logCPM 2-VST, 3-rlog
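To make the filtering and transformation parameters concrete, here is a minimal base-R sketch of a CPM filter followed by a started-log transform, log2(CPM + c). It assumes that `input_minCounts` is a CPM threshold, `input_NminSamples` is the number of samples that must reach it, and `input_countsLogStart` is the pseudo count c. The function name and toy matrix are hypothetical, and iDEP's `readData()` may implement these steps differently.

```r
# Illustrative only: iDEP's readData() may implement these steps differently.
filter_and_log_cpm <- function(counts, min_cpm = 0.5, n_min_samples = 1, pseudo = 4) {
  cpm <- sweep(counts, 2, colSums(counts), "/") * 1e6   # counts per million, per library
  keep <- rowSums(cpm >= min_cpm) >= n_min_samples      # expressed in enough samples
  log2(cpm[keep, , drop = FALSE] + pseudo)              # started-log transform
}

# Toy counts: each library sums to one million, so CPM equals the raw count.
counts <- matrix(c(250000, 750000, 0,
                   500000, 500000, 0),
                 nrow = 3,
                 dimnames = list(c("geneA", "geneB", "geneC"), c("S1", "S2")))
logcpm <- filter_and_log_cpm(counts)
logcpm   # geneC is dropped because it never reaches 0.5 CPM
```

A larger pseudo count shrinks differences among lowly expressed genes, which is why the value matters most for genes near the filtering threshold.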
readMetadata.out <- readMetadata(gldsMetadataFile)
library(knitr) # install if needed. for showing tables with kable
library(kableExtra)
kable( readMetadata.out ) %>%
kable_styling(bootstrap_options = c("striped", "hover")) %>%
scroll_box(width = "100%")
| | FLT_Cen_Rep1 | FLT_Cen_Rep2 | FLT_uG_Rep1 | FLT_uG_Rep2 | GC_1G_Rep1 | GC_1G_Rep2 |
|---|---|---|---|---|---|---|
| Sample.LongId | Atha.Col.0.clsCC.FLT.1G.Rep1.Array | Atha.Col.0.clsCC.FLT.1G.Rep2.Array | Atha.Col.0.clsCC.FLT.uG.Rep1.Array | Atha.Col.0.clsCC.FLT.uG.Rep2.Array | Atha.Col.0.clsCC.GC.1G.Rep1.Array | Atha.Col.0.clsCC.GC.1G.Rep2.Array |
| Sample.Id | Atha.Col.0.clsCC.FLT.1G.Rep1 | Atha.Col.0.clsCC.FLT.1G.Rep2 | Atha.Col.0.clsCC.FLT.uG.Rep1 | Atha.Col.0.clsCC.FLT.uG.Rep2 | Atha.Col.0.clsCC.GC.1G.Rep1 | Atha.Col.0.clsCC.GC.1G.Rep2 |
| Sample.Name | Atha_Col-0_clsCC_FLT_1G_Rep1 | Atha_Col-0_clsCC_FLT_1G_Rep2 | Atha_Col-0_clsCC_FLT_uG_Rep1 | Atha_Col-0_clsCC_FLT_uG_Rep2 | Atha_Col-0_clsCC_GC_1G_Rep1 | Atha_Col-0_clsCC_GC_1G_Rep2 |
| GLDS | 213 | 213 | 213 | 213 | 213 | 213 |
| Accession | GLDS-213 | GLDS-213 | GLDS-213 | GLDS-213 | GLDS-213 | GLDS-213 |
| Hardware | SIMBOX centrafuge vs GC | SIMBOX centrafuge vs GC | SIMBOX centrafuge vs GC | SIMBOX centrafuge vs GC | SIMBOX centrafuge vs GC | SIMBOX centrafuge vs GC |
| Tissue | Cell cultures | Cell cultures | Cell cultures | Cell cultures | Cell cultures | Cell cultures |
| Age | 16 days | 16 days | 16 days | 16 days | 16 days | 16 days |
| Organism | Arabidopsis thaliana | Arabidopsis thaliana | Arabidopsis thaliana | Arabidopsis thaliana | Arabidopsis thaliana | Arabidopsis thaliana |
| Ecotype | Col-0 | Col-0 | Col-0 | Col-0 | Col-0 | Col-0 |
| Genotype | WT | WT | WT | WT | WT | WT |
| Variety | Col-0 WT | Col-0 WT | Col-0 WT | Col-0 WT | Col-0 WT | Col-0 WT |
| Radiation | Cosmic radiation | Cosmic radiation | Cosmic radiation | Cosmic radiation | Background Earth | Background Earth |
| Gravity | Microgravity with 1G centrafuge | Microgravity with 1G centrafuge | Microgravity | Microgravity | Terrestrial | Terrestrial |
| Developmental | 16 day old cell culture | 16 day old cell culture | 16 day old cell culture | 16 day old cell culture | 16 day old cell culture | 16 day old cell culture |
| Time.series.or.Concentration.gradient | Single time point | Single time point | Single time point | Single time point | Single time point | Single time point |
| Light | White light | White light | White light | White light | White light | White light |
| Assay..RNAseq. | Microarray Transcription Profiling | Microarray Transcription Profiling | Microarray Transcription Profiling | Microarray Transcription Profiling | Microarray Transcription Profiling | Microarray Transcription Profiling |
| Temperature | 22-24 | 22-24 | 22-24 | 22-24 | 22-24 | 22-24 |
| Treatment.type | A WholeGenome Microarray Study of Arabidopsis thaliana Semisolid Callus Cultures Exposed to Microgravity and Nonmicrogravity Related Spaceflight Conditions for 5 on Board of Shenzhou 8 | A WholeGenome Microarray Study of Arabidopsis thaliana Semisolid Callus Cultures Exposed to Microgravity and Nonmicrogravity Related Spaceflight Conditions for 5 on Board of Shenzhou 8 | A WholeGenome Microarray Study of Arabidopsis thaliana Semisolid Callus Cultures Exposed to Microgravity and Nonmicrogravity Related Spaceflight Conditions for 5 on Board of Shenzhou 8 | A WholeGenome Microarray Study of Arabidopsis thaliana Semisolid Callus Cultures Exposed to Microgravity and Nonmicrogravity Related Spaceflight Conditions for 5 on Board of Shenzhou 8 | A WholeGenome Microarray Study of Arabidopsis thaliana Semisolid Callus Cultures Exposed to Microgravity and Nonmicrogravity Related Spaceflight Conditions for 5 on Board of Shenzhou 8 | A WholeGenome Microarray Study of Arabidopsis thaliana Semisolid Callus Cultures Exposed to Microgravity and Nonmicrogravity Related Spaceflight Conditions for 5 on Board of Shenzhou 8 |
| Treatment.intensity | x | x | x | x | x | x |
| Treament.timing | x | x | x | x | x | x |
| Preservation.Method. | RNAlater | RNAlater | RNAlater | RNAlater | RNAlater | RNAlater |
readData.out <- readData(inputFile)
## Warning in DESeqDataSet(se, design = design, ignoreRank): some variables in
## design formula are characters, converting to factors
kable( head(readData.out$data) ) %>%
kable_styling(bootstrap_options = c("striped", "hover")) %>%
scroll_box(width = "100%")
| | FLT_Cen_Rep1 | FLT_Cen_Rep2 | FLT_uG_Rep1 | FLT_uG_Rep2 | GC_1G_Rep1 | GC_1G_Rep2 |
|---|---|---|---|---|---|---|
| AT1G03850 | 3.321928 | 3.321928 | 3.321928 | 3.321928 | 3.906891 | 3.906891 |
| AT5G43580 | 3.169925 | 3.000000 | 3.169925 | 3.169925 | 3.807355 | 3.807355 |
| AT1G30700 | 3.000000 | 3.000000 | 3.169925 | 3.169925 | 3.807355 | 3.700440 |
| AT3G02480 | 3.000000 | 3.169925 | 3.000000 | 3.000000 | 3.807355 | 3.584963 |
| AT5G22270 | 3.169925 | 3.169925 | 3.459432 | 3.459432 | 3.906891 | 3.807355 |
| AT3G01420 | 3.169925 | 3.000000 | 3.169925 | 3.169925 | 3.700440 | 3.584963 |
readSampleInfo.out <- readSampleInfo(sampleInfoFile)
kable( readSampleInfo.out ) %>%
kable_styling(bootstrap_options = c("striped", "hover")) %>%
scroll_box(width = "100%")
| | Gravity |
|---|---|
| FLT_Cen_Rep1 | Centrafuge |
| FLT_Cen_Rep2 | Centrafuge |
| FLT_uG_Rep1 | Microgravity |
| FLT_uG_Rep2 | Microgravity |
| GC_1G_Rep1 | Terrestrial |
| GC_1G_Rep2 | Terrestrial |
input_selectOrg ="NEW"
input_selectGO <- 'GOBP' #Gene set category
input_noIDConversion = TRUE
allGeneInfo.out <- geneInfo(geneInfoFile)
converted.out = NULL
convertedData.out <- convertedData()
nGenesFilter()
## [1] "16156 genes in 6 samples. 16156 genes passed filter.\n Original gene IDs used."
convertedCounts.out <- convertedCounts() # converted counts, just for compatibility
# Read counts per library
parDefault = par()
par(mar=c(12,4,2,2))
# barplot of total read counts
x <- readData.out$rawCounts
groups = as.factor( detectGroups(colnames(x ) ) )
if(nlevels(groups)<=1 | nlevels(groups) >20 )
col1 = 'green' else
col1 = rainbow(nlevels(groups))[ groups ]
barplot( colSums(x)/1e6,
col=col1,las=3, main="Total read counts (millions)")
readCountsBias() # detecting bias in sequencing depth
## [1] 0.2359756
## [1] 0.2359756
## [1] "No bias detected"
# Box plot
x = readData.out$data
boxplot(x, las = 2, col=col1,
ylab='Transformed expression levels',
main='Distribution of transformed data')
#Density plot
par(parDefault)
## Warning in par(parDefault): graphical parameter "cin" cannot be set
## Warning in par(parDefault): graphical parameter "cra" cannot be set
## Warning in par(parDefault): graphical parameter "csi" cannot be set
## Warning in par(parDefault): graphical parameter "cxy" cannot be set
## Warning in par(parDefault): graphical parameter "din" cannot be set
## Warning in par(parDefault): graphical parameter "page" cannot be set
densityPlot()
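The "cannot be set" warnings arise because `par()` with no arguments also returns read-only parameters such as `cin` and `page`. A minimal sketch of saving and restoring only the settable parameters follows; the variable names are illustrative, and `pdf(NULL)` is used here only so the example draws to a null device.

```r
pdf(NULL)                             # null device so the example writes no file
old_par <- par(no.readonly = TRUE)    # save only the parameters that can be set
par(mar = c(12, 4, 2, 2))             # wide bottom margin for long sample names
# ... draw plots here ...
par(old_par)                          # restore the saved settings
restored <- identical(par("mar"), old_par$mar)
dev.off()
```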
# Scatter plot of the first two samples
plot(x[,1:2],xlab=colnames(x)[1],ylab=colnames(x)[2],
main='Scatter plot of first two samples')
####plot gene or gene family
input_selectOrg ="BestMatch"
input_geneSearch <- 'HOXA' #Gene ID for searching
genePlot()
## NULL
input_useSD <- 'FALSE' #Use standard deviation instead of standard error in error bar?
geneBarPlotError()
## NULL
# hierarchical clustering tree
x <- readData.out$data
maxGene <- apply(x,1,max)
# remove the bottom 25% of lowly expressed genes, which inflate the PCC
x <- x[which(maxGene > quantile(maxGene)[2] ) ,] # quantile()[2] is the 25th percentile; [1] is the minimum
plot(as.dendrogram(hclust2( dist2(t(x)))), ylab="1 - Pearson C.C.", type = "rectangle")
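The dendrogram uses 1 minus the Pearson correlation coefficient as the distance between samples. The sketch below reproduces that idea with base R on simulated data. `pearson_dist()` is a hypothetical stand-in, average linkage is an assumption, and iDEP's `dist2()` and `hclust2()` may differ in detail.

```r
# Hypothetical stand-in for a correlation distance; rows of x are the items compared.
pearson_dist <- function(x) as.dist(1 - cor(t(x), method = "pearson"))

set.seed(1)
expr <- matrix(rnorm(40), nrow = 10,
               dimnames = list(paste0("gene", 1:10), paste0("S", 1:4)))
expr[, "S2"] <- 2 * expr[, "S1"] + 1    # S2 is a linear function of S1, so r = 1

d <- pearson_dist(t(expr))              # samples in rows, so distances are between samples
tree <- hclust(d, method = "average")   # linkage choice is an assumption
tree$labels[tree$order]                 # leaf order of the dendrogram
# plot(as.dendrogram(tree), ylab = "1 - Pearson C.C.")
```

Because the distance depends only on correlation, samples that differ by a scale or offset (like S1 and S2 here) end up at distance zero.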
#Correlation matrix
input_labelPCC <- TRUE #Show correlation coefficient?
correlationMatrix()
# Parameters for heatmap
input_nGenes <- 1000 #Top genes for heatmap
input_geneCentering <- TRUE #centering genes ?
input_sampleCentering <- FALSE #Center by sample?
input_geneNormalize <- FALSE #Normalize by gene?
input_sampleNormalize <- FALSE #Normalize by sample?
input_noSampleClustering <- FALSE #Use original sample order
input_heatmapCutoff <- 4 #Remove outliers beyond number of SDs
input_distFunctions <- 1 #Which distance function to use
input_hclustFunctions <- 1 #Linkage type
input_heatColors1 <- 1 #Colors
input_selectFactorsHeatmap <- NULL #Sample coloring factors
png('heatmap.png', width = 10, height = 15, units = 'in', res = 300)
staticHeatmap()
dev.off()
## png
## 2
![heatmap](heatmap.png)
heatmapPlotly() # interactive heatmap using Plotly
input_nGenesKNN <- 2000 #Number of genes for k-Means
input_nClusters <- 4 #Number of clusters
maxGeneClustering = 12000
input_kmeansNormalization <- 'geneMean' #Normalization
input_KmeansReRun <- 0 #Random seed
distributionSD() #Distribution of standard deviations
KmeansNclusters() #Number of clusters
Kmeans.out = Kmeans() #Running K-means
KmeansHeatmap() #Heatmap for k-Means
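The k-means settings above select the most variable genes, center each gene on its mean, and partition the genes into a fixed number of clusters. A minimal sketch of that workflow with base R's `kmeans()` on simulated data is shown below. The variable names and the smaller gene count are illustrative, and iDEP's `Kmeans()` may differ in details such as seeding.

```r
set.seed(0)                             # fixed seed, analogous to input_KmeansReRun
expr <- matrix(rnorm(200 * 6), nrow = 200,
               dimnames = list(paste0("g", 1:200), paste0("S", 1:6)))

n_genes <- 50                           # stands in for input_nGenesKNN
n_clusters <- 4                         # stands in for input_nClusters

gene_sd <- apply(expr, 1, sd)           # rank genes by standard deviation
top <- expr[order(gene_sd, decreasing = TRUE)[1:n_genes], ]
centered <- top - rowMeans(top)         # 'geneMean' normalization: subtract each gene's mean

km <- kmeans(centered, centers = n_clusters, nstart = 10)
table(km$cluster)                       # genes per cluster
```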
#Read gene sets for enrichment analysis
sqlite <- dbDriver('SQLite')
input_selectGO3 <- NULL #Gene set category
input_minSetSize <- 15 #Min gene set size
input_maxSetSize <- 2000 #Max gene set size
GeneSets.out <-readGeneSets( geneSetFile,
convertedData.out, input_selectGO3,input_selectOrg,
c(input_minSetSize, input_maxSetSize) )
# Alternatively, users can use their own GMT files by
#GeneSets.out <- readGMTRobust('somefile.GMT')
results <- KmeansGO() #Enrichment analysis for k-Means clusters
results$adj.Pval <- format( results$adj.Pval,digits=3 )
kable( results, row.names=FALSE) %>%
kable_styling(bootstrap_options = c("striped", "hover")) %>%
scroll_box(width = "100%")
| Cluster | adj.Pval | Genes | Pathways |
|---|---|---|---|
| A | 1.81e-53 | 70 | Photosynthesis |
| | 1.22e-44 | 162 | Response to abiotic stimulus |
| | 1.51e-32 | 137 | Response to organic substance |
| | 1.33e-31 | 122 | Response to hormone |
| | 1.42e-31 | 40 | Photosynthesis, light reaction |
| | 4.65e-31 | 122 | Response to endogenous stimulus |
| | 1.39e-29 | 76 | Response to light stimulus |
| | 1.78e-28 | 76 | Response to radiation |
| | 1.08e-26 | 108 | Cellular response to chemical stimulus |
| | 2.05e-23 | 101 | Oxidation-reduction process |
| B | 1.08e-09 | 17 | Regulation of cell cycle process |
| | 1.16e-07 | 19 | Regulation of organelle organization |
| | 1.86e-07 | 43 | Oxidation-reduction process |
| | 8.83e-07 | 14 | Regulation of mitotic cell cycle |
| | 8.83e-07 | 11 | Regulation of nuclear division |
| | 1.07e-06 | 10 | Regulation of mitotic nuclear division |
| | 1.09e-06 | 18 | Regulation of cell cycle |
| | 2.09e-06 | 52 | Phosphate-containing compound metabolic process |
| | 3.22e-06 | 52 | Phosphorus metabolic process |
| | 3.22e-06 | 20 | Regulation of cellular component organization |
| C | 5.41e-28 | 60 | Response to external stimulus |
| | 1.26e-27 | 51 | Response to external biotic stimulus |
| | 1.26e-27 | 51 | Response to other organism |
| | 1.81e-27 | 51 | Response to biotic stimulus |
| | 1.94e-25 | 52 | Defense response |
| | 4.73e-22 | 52 | Multi-organism process |
| | 2.05e-19 | 52 | Cellular response to chemical stimulus |
| | 4.83e-18 | 26 | Response to fungus |
| | 2.46e-17 | 35 | Defense response to other organism |
| | 3.99e-15 | 55 | Response to abiotic stimulus |
| D | 1.07e-27 | 124 | Response to abiotic stimulus |
| | 4.30e-22 | 109 | Response to organic substance |
| | 5.44e-21 | 94 | Response to oxygen-containing compound |
| | 5.90e-19 | 33 | Cellular response to decreased oxygen levels |
| | 5.90e-19 | 33 | Cellular response to oxygen levels |
| | 5.90e-19 | 33 | Cellular response to hypoxia |
| | 1.11e-18 | 81 | Response to external stimulus |
| | 1.98e-18 | 86 | Cellular response to chemical stimulus |
| | 1.52e-17 | 33 | Response to hypoxia |
| | 2.22e-17 | 33 | Response to decreased oxygen levels |
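Enrichment tables of this kind are commonly built from a hypergeometric (over-representation) test per gene set, followed by FDR adjustment. Whether `KmeansGO()` does exactly this is an assumption. The sketch below uses base R only; `enrich_p()` is a hypothetical helper, and the overlap and set sizes are made-up illustration values (only the 16156-gene universe comes from the filter step in this report).

```r
# Hypothetical helper: P(overlap or more genes from the set) under random sampling.
enrich_p <- function(overlap, set_size, list_size, universe) {
  phyper(overlap - 1, set_size, universe - set_size, list_size, lower.tail = FALSE)
}

# Made-up example: a 100-gene cluster tested against two 50-gene sets.
p <- enrich_p(overlap = c(10, 3), set_size = c(50, 50),
              list_size = 100, universe = 16156)
adj_p <- p.adjust(p, method = "fdr")    # Benjamini-Hochberg adjustment
data.frame(overlap = c(10, 3), p = p, adj.Pval = adj_p)
```

The `overlap - 1` shift is needed because `phyper(q, ..., lower.tail = FALSE)` returns P(X > q), while enrichment asks for P(X >= overlap).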
input_seedTSNE <- 0 #Random seed for t-SNE
input_colorGenes <- TRUE #Color genes in t-SNE plot?
tSNEgenePlot() #Plot genes using t-SNE
input_selectFactors <- 'Gravity' #Factor coded by color
input_selectFactors2 <- 'Sample_Name' #Factor coded by shape
input_tsneSeed2 <- 0 #Random seed for t-SNE
#PCA, MDS and t-SNE plots
PCAplot()
MDSplot()
tSNEplot()
#Read gene sets for pathway analysis using PGSEA on principal components
input_selectGO6 <- 'GOBP'
GeneSets.out <-readGeneSets( geneSetFile,
convertedData.out, input_selectGO6,input_selectOrg,
c(input_minSetSize, input_maxSetSize) )
PCApathway() # Run PGSEA analysis
## Warning: Package 'KEGG.db' is deprecated and will be removed from Bioconductor
## version 3.12
cat( PCA2factor() ) #The correlation between PCs with factors
##
## Correlation between Principal Components (PCs) with factors
## PC1 is correlated with Gravity (p=1.48e-02).
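As a rough illustration of how a principal component can be related to a sample factor, the sketch below runs PCA with `prcomp()` on simulated data and tests PC1 against a three-level grouping with a one-way ANOVA. The choice of ANOVA is an assumption; the test actually used by `PCA2factor()` is not shown in this report. The data and effect size are simulated.

```r
set.seed(42)
groups <- factor(rep(c("Centrafuge", "Microgravity", "Terrestrial"), each = 2))
expr <- matrix(rnorm(100 * 6), nrow = 100)          # 100 genes x 6 samples
is_terrestrial <- groups == "Terrestrial"
expr[1:20, is_terrestrial] <- expr[1:20, is_terrestrial] + 3   # simulated group effect

pca <- prcomp(t(expr), center = TRUE, scale. = FALSE)   # samples in rows
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# One-way ANOVA of PC1 scores against the factor (assumed test, see above).
pval <- anova(lm(pca$x[, 1] ~ groups))[["Pr(>F)"]][1]
round(var_explained[1:2], 3)
pval
```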
input_CountsDEGMethod <- 2 #DESeq2= 3,limma-voom=2,limma-trend=1
input_limmaPval <- 0.1 #FDR cutoff
input_limmaFC <- 2 #Fold-change cutoff
input_selectModelComprions <- c('Gravity: Terrestrial vs. Centrafuge','Gravity: Terrestrial vs. Microgravity') #Selected comparisons
input_selectFactorsModel <- 'Gravity' #Selected factor(s) for the model
input_selectInteractions <- NULL #Selected interaction terms
input_selectBlockFactorsModel <- NULL #Selected block factor(s)
factorReferenceLevels.out <- c('Gravity:Terrestrial')
limma.out <- limma()
## Error in if (treatments[kp] == treatments[kk]) {: missing value where TRUE/FALSE needed
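The settings above describe a one-factor model on Gravity with Terrestrial as the reference level. As a sketch in base R only (it does not call iDEP's `limma()` wrapper and is not a diagnosis of the error shown), the corresponding design matrix can be built as follows. The sample names and factor levels are taken from the sample information table; everything else is illustrative.

```r
samples <- c("FLT_Cen_Rep1", "FLT_Cen_Rep2", "FLT_uG_Rep1",
             "FLT_uG_Rep2", "GC_1G_Rep1", "GC_1G_Rep2")
gravity <- factor(c("Centrafuge", "Centrafuge", "Microgravity",
                    "Microgravity", "Terrestrial", "Terrestrial"))
gravity <- relevel(gravity, ref = "Terrestrial")   # make Terrestrial the baseline

design <- model.matrix(~ gravity)
rownames(design) <- samples
design
```

With this coding, each non-intercept coefficient estimates the difference between one flight condition and the Terrestrial baseline.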
DEG.data.out <- DEG.data()
## Error in DEG.data(): object 'limma.out' not found
limma.out$comparisons
## Error in eval(expr, envir, enclos): object 'limma.out' not found
input_selectComparisonsVenn = limma.out$comparisons[1:3] # use first three comparisons
## Error in eval(expr, envir, enclos): object 'limma.out' not found
input_UpDownRegulated <- FALSE #Split up and down regulated genes
vennPlot() # Venn diagram
## Error in vennPlot(): object 'limma.out' not found
sigGeneStats() # number of DEGs as figure
## Error in sigGeneStats(): object 'limma.out' not found
sigGeneStatsTable() # number of DEGs as table
## Error in sigGeneStatsTable(): object 'limma.out' not found
input_selectContrast = limma.out$comparisons[1] # use first comparisons
## Error in eval(expr, envir, enclos): object 'limma.out' not found
selectedHeatmap.data.out <- selectedHeatmap.data()
## Error in selectedHeatmap.data(): object 'limma.out' not found
selectedHeatmap() # heatmap for DEGs in selected comparison
## Error in selectedHeatmap(): object 'selectedHeatmap.data.out' not found
# Save gene lists and data into files
write.csv( selectedHeatmap.data()$genes, 'heatmap.data.csv')
## Error in selectedHeatmap.data(): object 'limma.out' not found
write.csv(DEG.data(),'DEG.data.csv' )
## Error in DEG.data(): object 'limma.out' not found
write(AllGeneListsGMT() ,'AllGeneListsGMT.gmt')
## Error in AllGeneListsGMT(): object 'limma.out' not found
input_selectGO2 <- 'GOBP' #Gene set category
geneListData.out <- geneListData()
## Error in geneListData(): object 'input_selectContrast' not found
volcanoPlot()
## Error in volcanoPlot(): object 'limma.out' not found
scatterPlot()
## Error in scatterPlot(): object 'limma.out' not found
MAplot()
## Error in MAplot(): object 'limma.out' not found
geneListGOTable.out <- geneListGOTable()
## Error in geneListGOTable(): object 'selectedHeatmap.data.out' not found
# Read pathway data again
GeneSets.out <-readGeneSets( geneSetFile,
convertedData.out, input_selectGO2,input_selectOrg,
c(input_minSetSize, input_maxSetSize) )
input_removeRedudantSets <- TRUE #Remove highly redundant gene sets?
results <- geneListGO() #Enrichment analysis
## Error in geneListGO(): object 'geneListGOTable.out' not found
results$adj.Pval <- format( results$adj.Pval,digits=3 )
kable( results, row.names=FALSE) %>%
kable_styling(bootstrap_options = c("striped", "hover")) %>%
scroll_box(width = "100%")
STRING-db API access. We need to find the NCBI taxonomy ID of your species, which is used by STRING. First we try to guess the ID based on iDEP's database. Users can also skip this step and assign the NCBI taxonomy ID directly, e.g. findTaxonomyID.out = 10090 # mouse 10090, human 9606 etc.
STRING10_species = read.csv(STRING10_speciesFile)
ix = grep('Arabidopsis thaliana', STRING10_species$official_name )
findTaxonomyID.out <- STRING10_species[ix,1] # find taxonomyID
findTaxonomyID.out
## [1] 3702
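A pattern search with `grep()` can return zero or several rows, which would silently give an empty or multi-valued taxonomy ID. The sketch below shows a stricter lookup on a toy table. The column names `taxon_id` and `official_name` and the helper `find_taxon_id()` are assumptions for illustration, not the layout of STRING10_species.csv.

```r
# Toy stand-in for the species table; column names here are assumptions.
species <- data.frame(
  taxon_id = c(3702, 9606, 10090),
  official_name = c("Arabidopsis thaliana", "Homo sapiens", "Mus musculus"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: exact, case-insensitive match that fails loudly on 0 or >1 hits.
find_taxon_id <- function(species, name) {
  hit <- which(tolower(species$official_name) == tolower(name))
  if (length(hit) != 1) {
    stop("Expected exactly one match for '", name, "', found ", length(hit))
  }
  species$taxon_id[hit]
}

find_taxon_id(species, "Arabidopsis thaliana")
```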
Enrichment analysis using STRING
STRINGdb_geneList.out <- STRINGdb_geneList() #convert gene lists
## Error in STRINGdb_geneList(): object 'geneListData.out' not found
input_STRINGdbGO <- 'Process' #'Process', 'Component', 'Function', 'KEGG', 'Pfam', 'InterPro'
results <- stringDB_GO_enrichmentData() # enrichment using STRING
## Error in stringDB_GO_enrichmentData(): object 'selectedHeatmap.data.out' not found
results$adj.Pval <- format( results$adj.Pval,digits=3 )
kable( results, row.names=FALSE) %>%
kable_styling(bootstrap_options = c("striped", "hover")) %>%
scroll_box(width = "100%")
PPI network retrieval and analysis
input_nGenesPPI <- 100 #Number of top genes for PPI retrieval and analysis
stringDB_network1(1) #Show PPI network
## Error in stringDB_network1(1): object 'STRINGdb_geneList.out' not found
Generating interactive PPI
write(stringDB_network_link(), 'PPI_results.html') # write results to html file
## Error in stringDB_network_link(): object 'STRINGdb_geneList.out' not found
browseURL('PPI_results.html') # open in browser
input_selectContrast1 = limma.out$comparisons[1]
## Error in eval(expr, envir, enclos): object 'limma.out' not found
#input_selectContrast1 = limma.out$comparisons[3] # manually set
input_selectGO <- 'GOBP' #Gene set category
#input_selectGO='custom' # if custom gmt file
input_minSetSize <- 15 #Min size for gene set
input_maxSetSize <- 2000 #Max size for gene set
# Read pathway data again
GeneSets.out <-readGeneSets( geneSetFile,
convertedData.out, input_selectGO,input_selectOrg,
c(input_minSetSize, input_maxSetSize) )
input_pathwayPvalCutoff <- 0.2 #FDR cutoff
input_nPathwayShow <- 30 #Top pathways to show
input_absoluteFold <- FALSE #Use absolute values of fold-change?
input_GenePvalCutoff <- 1 #FDR to remove genes
input_pathwayMethod = 1 # 1 GAGE
gagePathwayData.out <- gagePathwayData() # pathway analysis using GAGE
## Error in gagePathwayData(): object 'limma.out' not found
results <- gagePathwayData.out #Results of pathway analysis using GAGE
## Error in eval(expr, envir, enclos): object 'gagePathwayData.out' not found
results$adj.Pval <- format( results$adj.Pval,digits=3 )
kable( results, row.names=FALSE) %>%
kable_styling(bootstrap_options = c("striped", "hover")) %>%
scroll_box(width = "100%")
pathwayListData.out = pathwayListData()
## Error in pathwayListData(): object 'gagePathwayData.out' not found
enrichmentPlot(pathwayListData.out, 25 )
## Error in enrichmentPlot(pathwayListData.out, 25): object 'pathwayListData.out' not found
enrichmentNetwork(pathwayListData.out )
## Error in h(simpleError(msg, call)): error in evaluating the argument 'X' in selecting a method for function 'lapply': object 'pathwayListData.out' not found
enrichmentNetworkPlotly(pathwayListData.out)
## Error in h(simpleError(msg, call)): error in evaluating the argument 'X' in selecting a method for function 'lapply': object 'pathwayListData.out' not found
input_pathwayMethod = 3 # 3 fgsea
fgseaPathwayData.out <- fgseaPathwayData() #Pathway analysis using fgsea
## Error in fgseaPathwayData(): object 'limma.out' not found
results <- fgseaPathwayData.out #Results of pathway analysis using fgsea
## Error in eval(expr, envir, enclos): object 'fgseaPathwayData.out' not found
results$adj.Pval <- format( results$adj.Pval,digits=3 )
kable( results, row.names=FALSE) %>%
kable_styling(bootstrap_options = c("striped", "hover")) %>%
scroll_box(width = "100%")
pathwayListData.out = pathwayListData()
## Error in pathwayListData(): object 'fgseaPathwayData.out' not found
enrichmentPlot(pathwayListData.out, 25 )
## Error in enrichmentPlot(pathwayListData.out, 25): object 'pathwayListData.out' not found
enrichmentNetwork(pathwayListData.out )
## Error in h(simpleError(msg, call)): error in evaluating the argument 'X' in selecting a method for function 'lapply': object 'pathwayListData.out' not found
enrichmentNetworkPlotly(pathwayListData.out)
## Error in h(simpleError(msg, call)): error in evaluating the argument 'X' in selecting a method for function 'lapply': object 'pathwayListData.out' not found
PGSEAplot() # pathway analysis using PGSEA
## Error in PGSEAplot(): object 'input_selectContrast1' not found
input_selectContrast2 = limma.out$comparisons[1]
## Error in eval(expr, envir, enclos): object 'limma.out' not found
#input_selectContrast2 = limma.out$comparisons[3] # manually set
input_limmaPvalViz <- 0.1 #FDR to filter genes
input_limmaFCViz <- 2 #Fold-change cutoff to filter genes
genomePlotly() # shows fold-changes on the genome
## Error in genomePlotly(): object 'limma.out' not found
input_nGenesBiclust <- 1000 #Top genes for biclustering
input_biclustMethod <- 'BCCC()' #Method: 'BCCC', 'QUBIC', 'runibic' ...
biclustering.out = biclustering() # run analysis
input_selectBicluster <- NULL #select a cluster
biclustHeatmap() # heatmap for selected cluster
## Error in res[[i]] <- x[BicRes@RowxNumber[, number[i]], BicRes@NumberxCol[number[i], : attempt to select less than one element in integerOneIndex
input_selectGO4 = 'GOBP' # gene set category
# Read pathway data again
GeneSets.out <- readGeneSets(geneSetFile,
  convertedData.out, input_selectGO4, input_selectOrg,
  c(input_minSetSize, input_maxSetSize))
results <- geneListBclustGO() #Enrichment analysis for the selected bicluster
## Error in res[[i]] <- x[BicRes@RowxNumber[, number[i]], BicRes@NumberxCol[number[i], : attempt to select less than one element in integerOneIndex
results$adj.Pval <- format( results$adj.Pval,digits=3 ) # the call above failed, so 'results' still holds the earlier enrichment table
kable( results, row.names=FALSE) %>%
kable_styling(bootstrap_options = c("striped", "hover")) %>%
scroll_box(width = "100%")
| Cluster | adj.Pval | Genes | Pathways |
|---|---|---|---|
| A | 1.81e-53 | 70 | Photosynthesis |
| | 1.22e-44 | 162 | Response to abiotic stimulus |
| | 1.51e-32 | 137 | Response to organic substance |
| | 1.33e-31 | 122 | Response to hormone |
| | 1.42e-31 | 40 | Photosynthesis, light reaction |
| | 4.65e-31 | 122 | Response to endogenous stimulus |
| | 1.39e-29 | 76 | Response to light stimulus |
| | 1.78e-28 | 76 | Response to radiation |
| | 1.08e-26 | 108 | Cellular response to chemical stimulus |
| | 2.05e-23 | 101 | Oxidation-reduction process |
| B | 1.08e-09 | 17 | Regulation of cell cycle process |
| | 1.16e-07 | 19 | Regulation of organelle organization |
| | 1.86e-07 | 43 | Oxidation-reduction process |
| | 8.83e-07 | 14 | Regulation of mitotic cell cycle |
| | 8.83e-07 | 11 | Regulation of nuclear division |
| | 1.07e-06 | 10 | Regulation of mitotic nuclear division |
| | 1.09e-06 | 18 | Regulation of cell cycle |
| | 2.09e-06 | 52 | Phosphate-containing compound metabolic process |
| | 3.22e-06 | 52 | Phosphorus metabolic process |
| | 3.22e-06 | 20 | Regulation of cellular component organization |
| C | 5.41e-28 | 60 | Response to external stimulus |
| | 1.26e-27 | 51 | Response to external biotic stimulus |
| | 1.26e-27 | 51 | Response to other organism |
| | 1.81e-27 | 51 | Response to biotic stimulus |
| | 1.94e-25 | 52 | Defense response |
| | 4.73e-22 | 52 | Multi-organism process |
| | 2.05e-19 | 52 | Cellular response to chemical stimulus |
| | 4.83e-18 | 26 | Response to fungus |
| | 2.46e-17 | 35 | Defense response to other organism |
| | 3.99e-15 | 55 | Response to abiotic stimulus |
| D | 1.07e-27 | 124 | Response to abiotic stimulus |
| | 4.30e-22 | 109 | Response to organic substance |
| | 5.44e-21 | 94 | Response to oxygen-containing compound |
| | 5.90e-19 | 33 | Cellular response to decreased oxygen levels |
| | 5.90e-19 | 33 | Cellular response to oxygen levels |
| | 5.90e-19 | 33 | Cellular response to hypoxia |
| | 1.11e-18 | 81 | Response to external stimulus |
| | 1.98e-18 | 86 | Cellular response to chemical stimulus |
| | 1.52e-17 | 33 | Response to hypoxia |
| | 2.22e-17 | 33 | Response to decreased oxygen levels |
input_mySoftPower <- 5 #SoftPower to cutoff
input_nGenesNetwork <- 1000 #Number of top genes
input_minModuleSize <- 20 #Module size minimum
wgcna.out = wgcna() # run WGCNA
## Warning: executing %dopar% sequentially: no parallel backend registered
## Power SFT.R.sq slope truncated.R.sq mean.k. median.k. max.k.
## 1 1 0.82200 2.5600 0.8610 660 703.0 798
## 2 2 0.75400 1.1800 0.7730 504 543.0 685
## 3 3 0.53100 0.7050 0.4960 410 440.0 608
## 4 4 0.28100 0.3800 0.1800 345 365.0 550
## 5 5 0.12700 0.1780 0.0748 298 308.0 506
## 6 6 0.00908 0.0511 -0.1480 263 262.0 470
## 7 7 0.00713 -0.0421 -0.2080 235 226.0 441
## 8 8 0.05010 -0.1280 -0.1030 212 196.0 417
## 9 9 0.13500 -0.2330 0.0346 194 171.0 396
## 10 10 0.19600 -0.4600 -0.0208 178 149.0 378
## 11 12 0.25500 -0.5080 0.0518 154 116.0 348
## 12 14 0.32500 -0.6820 0.2000 137 92.6 324
## 13 16 0.34100 -0.7250 0.2060 123 75.8 304
## 14 18 0.36200 -0.8710 0.2060 112 63.0 288
## 15 20 0.35300 -0.9030 0.2260 104 53.3 274
## TOM calculation: adjacency..
## ..will not use multithreading.
## Fraction of slow calculations: 0.000000
## ..connectivity..
## ..matrix multiplication (system BLAS)..
## ..normalization..
## ..done.
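The soft-threshold table above reports an unsigned `SFT.R.sq`. WGCNA's own diagnostic plots use the signed quantity `-sign(slope) * SFT.R.sq`, which is positive only where the slope is negative. A minimal sketch of that check follows, with the values copied from the printed table. The 0.8 cutoff is an assumption (a common rule of thumb), not a setting taken from this report.

```r
# Values copied from the soft-threshold table printed above
sft <- data.frame(
  Power    = c(1:10, 12, 14, 16, 18, 20),
  SFT.R.sq = c(0.822, 0.754, 0.531, 0.281, 0.127, 0.00908, 0.00713,
               0.0501, 0.135, 0.196, 0.255, 0.325, 0.341, 0.362, 0.353),
  slope    = c(2.56, 1.18, 0.705, 0.38, 0.178, 0.0511, -0.0421,
               -0.128, -0.233, -0.46, -0.508, -0.682, -0.725, -0.871, -0.903)
)

# Signed fit: positive only where the slope is negative, as expected
# under an approximately scale-free topology
sft$signed.R.sq <- -sign(sft$slope) * sft$SFT.R.sq

r2_cutoff  <- 0.8  # assumed cutoff, not taken from this report
candidates <- sft$Power[sft$signed.R.sq >= r2_cutoff]    # powers meeting the cutoff
best_power <- sft$Power[which.max(sft$signed.R.sq)]      # best signed fit among tested powers
chosen_fit <- sft$signed.R.sq[sft$Power == 5]            # fit at input_mySoftPower
```

On these numbers, `candidates` is empty and the signed fit at the chosen `input_mySoftPower` of 5 is negative, so the modules reported below are probably best interpreted with caution.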
softPower() # soft power curve
modulePlot() # plot modules
listWGCNA.Modules.out = listWGCNA.Modules() #modules
input_selectGO5 = 'GOBP' # gene set category
# Read pathway data again
GeneSets.out <- readGeneSets(geneSetFile,
  convertedData.out, input_selectGO5, input_selectOrg,
  c(input_minSetSize, input_maxSetSize))
input_selectWGCNA.Module <- NULL #Select a module; NULL is not a character value, which causes the strsplit() errors below
input_topGenesNetwork <- 10 #Number of top genes
input_edgeThreshold <- 0.4 #Edge threshold
moduleNetwork() # show network of top genes in selected module
## Error in strsplit(input_selectWGCNA.Module, " "): non-character argument
input_removeRedudantSets <- TRUE #Remove redundant gene sets
results <- networkModuleGO() #Enrichment analysis of selected module
## Error in strsplit(input_selectWGCNA.Module, " "): non-character argument
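Both `strsplit` errors above arise because `input_selectWGCNA.Module` is `NULL` and `strsplit()` accepts only character input. A minimal sketch that reproduces the message and guards against it is shown here. `split_module_label` is a hypothetical helper, and the label format in the example is an assumption for illustration, not the format iDEP necessarily uses.

```r
# strsplit() rejects NULL, which reproduces the error message above
msg <- tryCatch(strsplit(NULL, " "), error = function(e) conditionMessage(e))

# Hypothetical helper: split a module label only when exactly one
# character label was selected; otherwise return NULL.
split_module_label <- function(label) {
  if (is.null(label) || !is.character(label) || length(label) != 1) {
    return(NULL)
  }
  strsplit(label, " ", fixed = TRUE)[[1]]
}

no_selection <- split_module_label(NULL)
parts <- split_module_label("1. turquoise (120 genes)")  # assumed label format
```

A caller can then test `is.null(no_selection)` and skip the network plot, rather than passing `NULL` through to `strsplit()`.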
results$adj.Pval <- format( results$adj.Pval,digits=3 ) # the call above failed, so 'results' still holds the earlier enrichment table
kable( results, row.names=FALSE) %>%
kable_styling(bootstrap_options = c("striped", "hover")) %>%
scroll_box(width = "100%")
| Cluster | adj.Pval | Genes | Pathways |
|---|---|---|---|
| A | 1.81e-53 | 70 | Photosynthesis |
| | 1.22e-44 | 162 | Response to abiotic stimulus |
| | 1.51e-32 | 137 | Response to organic substance |
| | 1.33e-31 | 122 | Response to hormone |
| | 1.42e-31 | 40 | Photosynthesis, light reaction |
| | 4.65e-31 | 122 | Response to endogenous stimulus |
| | 1.39e-29 | 76 | Response to light stimulus |
| | 1.78e-28 | 76 | Response to radiation |
| | 1.08e-26 | 108 | Cellular response to chemical stimulus |
| | 2.05e-23 | 101 | Oxidation-reduction process |
| B | 1.08e-09 | 17 | Regulation of cell cycle process |
| | 1.16e-07 | 19 | Regulation of organelle organization |
| | 1.86e-07 | 43 | Oxidation-reduction process |
| | 8.83e-07 | 14 | Regulation of mitotic cell cycle |
| | 8.83e-07 | 11 | Regulation of nuclear division |
| | 1.07e-06 | 10 | Regulation of mitotic nuclear division |
| | 1.09e-06 | 18 | Regulation of cell cycle |
| | 2.09e-06 | 52 | Phosphate-containing compound metabolic process |
| | 3.22e-06 | 52 | Phosphorus metabolic process |
| | 3.22e-06 | 20 | Regulation of cellular component organization |
| C | 5.41e-28 | 60 | Response to external stimulus |
| | 1.26e-27 | 51 | Response to external biotic stimulus |
| | 1.26e-27 | 51 | Response to other organism |
| | 1.81e-27 | 51 | Response to biotic stimulus |
| | 1.94e-25 | 52 | Defense response |
| | 4.73e-22 | 52 | Multi-organism process |
| | 2.05e-19 | 52 | Cellular response to chemical stimulus |
| | 4.83e-18 | 26 | Response to fungus |
| | 2.46e-17 | 35 | Defense response to other organism |
| | 3.99e-15 | 55 | Response to abiotic stimulus |
| D | 1.07e-27 | 124 | Response to abiotic stimulus |
| | 4.30e-22 | 109 | Response to organic substance |
| | 5.44e-21 | 94 | Response to oxygen-containing compound |
| | 5.90e-19 | 33 | Cellular response to decreased oxygen levels |
| | 5.90e-19 | 33 | Cellular response to oxygen levels |
| | 5.90e-19 | 33 | Cellular response to hypoxia |
| | 1.11e-18 | 81 | Response to external stimulus |
| | 1.98e-18 | 86 | Cellular response to chemical stimulus |
| | 1.52e-17 | 33 | Response to hypoxia |
| | 2.22e-17 | 33 | Response to decreased oxygen levels |